Skip to content

fix(jax): structural defense against cached_property pytree/dict leaks#1302

Merged
Jammy2211 merged 1 commit into
mainfrom
feature/cached-property-pytree-guard
May 29, 2026
Merged

fix(jax): structural defense against cached_property pytree/dict leaks#1302
Jammy2211 merged 1 commit into
mainfrom
feature/cached-property-pytree-guard

Conversation

@Jammy2211
Copy link
Copy Markdown
Collaborator

Summary

Follow-up to #1300. The previous PR fixed a specific leak; this PR fixes the class of bug: every __dict__ walker on the model side uses an opt-out filter (underscore prefix + hardcoded blacklist), so any future @cached_property on a model class silently reproduces the same 38-script JAX failure mode.

Six sites in autofit/mapper/ are extended with a frozenset union of the class's cached_property names (sourced from PyAutoLabs/PyAutoConf#111's new cached_property_names helper):

File:line Site
autofit/mapper/model.py:86 AbstractModel.__getstate__
autofit/mapper/model.py:467 ModelInstance.dict
autofit/mapper/model_object.py:332 ModelObject._dict (feeds Collection.items)
autofit/mapper/prior_model/abstract.py:1280 AbstractModel.items
autofit/mapper/prior_model/collection.py:289 Collection._instance_for_arguments
autofit/mapper/prior_model/prior_model.py:497 Model._instance_for_arguments

A new classmethod AbstractModel._cached_property_names() delegates to the autoconf helper so the filter sites all read from a single source of truth.

Identifier-hash stability — verified

Critical for release: the unique_identifier hash (search output directory name, fit resume key, DB lookup) must not shift. Verified:

  1. The identifier (autofit/mapper/identifier.py:66-159) walks __dict__ independently — it does NOT call any of the 6 sites modified here.
  2. The identifier's own walker already filters _-prefix keys at line 137.
  3. Post-fix(jax): keep parameterization cache off ModelInstance + auto-register pytrees #1300, zero live @cached_property exists on any AbstractModel descendant in PyAutoFit/PyAutoArray/PyAutoGalaxy/PyAutoLens. The defense is purely forward-compat.
  4. Stress-test on three representative model shapes confirms byte-identical hashes pre/post defense:
    • Collection(gaussian=Model(Gaussian))f7f19073...
    • Nested Collection(a=Model, b=Collection(nested=Model))04e1328c...
    • Collection(Model(Gaussian, centre=(0.0, 0.0)))36084d2c...

Test plan

  • pytest test_autofit/mapper/test_parameterization.py -v — 13/13 pass (2 new tests + 11 existing).
  • pytest test_autofit -q — 1415/1415 pass (1413 prior + 2 new), 1 skipped (no regressions).
  • Identifier stability stress test on 3 model shapes — byte-identical strings pre/post.

Dependency

Depends on PyAutoLabs/PyAutoConf#111 (must merge first — provides the cached_property_names MRO walker).

A paired PyAutoArray PR ships the same defense on the pytree flatten side: PyAutoLabs/PyAutoArray#TBD.

🤖 Generated with Claude Code

PR #1300 fixed a specific leak where AbstractPriorModel.parameterization
(a `@functools.cached_property` added in commit 4564ae9) leaked its
cached string into every ModelInstance via
Collection._instance_for_arguments. That broke 38 JAX jit(fit_from)
calls and the autofit_workspace overview_1 smoke (clusters C1+C4).

The minimal fix renamed the cache key to `_parameterization_cache` so
the existing `_`-prefix filter at each `__dict__` iterator skipped it.
The structural problem remained: every walker uses an opt-out filter
(blacklist + underscore prefix), so any future cached_property declared
on a model class silently reproduces the same class of bug.

This PR closes the class:

- New classmethod `AbstractModel._cached_property_names(cls)` delegates
  to `autoconf.tools.decorators.cached_property_names` (PyAutoConf #111),
  returning a frozenset of every functools.cached_property and autoconf
  CachedProperty descriptor name in the MRO.
- Extend the filter at every `__dict__` iteration site to union the
  pre-existing exclusion with this frozenset:
    autofit/mapper/model.py            (__getstate__, ModelInstance.dict)
    autofit/mapper/model_object.py     (ModelObject._dict — feeds Collection.items)
    autofit/mapper/prior_model/abstract.py        (AbstractModel.items)
    autofit/mapper/prior_model/collection.py      (Collection._instance_for_arguments)
    autofit/mapper/prior_model/prior_model.py     (Model._instance_for_arguments)

Identifier-hash stability verified: the unique_identifier walker at
`autofit/mapper/identifier.py` does NOT call any of these 6 sites — it
walks `__dict__` independently with its own `_`-prefix filter. Three
representative model shapes (simple Collection, nested Collection,
Model with tuple arg) all produce byte-identical identifier hashes
pre- and post-defense:
  simple:     f7f19073a8fb19b3d11231fb6eef7e3b ✓
  nested:     04e1328c84a1e4c3a81a9d3544dd19f5 ✓
  with_tuple: 36084d2c3fec27e0b7aa504add0bd898 ✓

Tests:
- `test_cached_property_names_classmethod_walks_mro`: confirms the
  classmethod surfaces MRO-declared descriptors and memoises per-class.
- `test_cached_property_excluded_from_all_dict_walks`: ships a synthetic
  GuardedCollection with a cached_property returning a string; asserts
  the value never appears in instance.__dict__, instance.dict,
  model.items(), tree_flatten() leaves, __getstate__, or pickle
  round-trip.
- 1415/1415 PyAutoFit tests pass (1413 prior + 2 new).

Depends on: PyAutoLabs/PyAutoConf#111.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Jammy2211 Jammy2211 merged commit 76d6aef into main May 29, 2026
5 of 7 checks passed
@Jammy2211 Jammy2211 deleted the feature/cached-property-pytree-guard branch May 29, 2026 07:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant